Speaker-independent connected letter recognition with a multi-state time delay neural network

نویسندگان

  • Hermann Hild
  • Alexander H. Waibel
چکیده

We present a Multi-State Time Del ay Neural Network (MS-TDNN) for speaker-i ndependent, connected l etter recogni ti on. Our MS-TDNNachi eves 98. 5/92.0% word accuracy on speaker dependent/i ndependent Engl i sh l etter tasks[7, 8]. In thi s paper we wi l l summari ze several techni ques to improve (a) conti nuous recogni ti on performance, such as sentence l evel trai ni ng, and (b) phoneti c model i ng, such as network archi tectures wi th \i nternal speaker model s", al l owing for \tuni ng-i n" to newspeakWe al so present resul ts on our l arge and sti l l growing GermanLetter data base, contai ni ng over 40. 000 l etnti nuousl y spel l ed by 55 speakers. European Conference on Speech, Communication and Technology (EUROSPEECH 93), Berl in, Germany, Vol ume 2, pp. 1481-1484 el l ed Letter Recogni ti on, Speaker-

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Connected Letter Recognition with a Multi-State Time Delay Neural Network

The Multi-State Time Delay Neural Network (MS-TDNN) integrates a nonlinear time alignment procedure (DTW) and the highaccuracy phoneme spotting capabilities of a TDNN into a connectionist speech recognition system with word-level classification and error backpropagation. We present an MS-TDNN for recognizing continuously spelled letters, a task characterized by a small but highly confusable voc...

متن کامل

Multi-State Time Delay Neural Networks for Continuous Speech Recognition

Alex Waibel Carnegie Mellon University Pittsburgh, PA 15213 [email protected] We present the "Multi-State Time Delay Neural Network" (MS-TDNN) as an extension of the TDNN to robust word recognition. Unlike most other hybrid methods. the MS-TDNN embeds an alignment search procedure into the connectionist architecture. and allows for word level supervision. The resulting system has the ability to ma...

متن کامل

Connectionist Architectures for Multi-Speaker Phoneme Recognition

We present a number of Time-Delay Neural Network (TDNN) based architectures for multi-speaker phoneme recognition (/b,d,g/ task). We use speech of two females and four males to compare the performance of the various architectures against a baseline recognition rate of 95.9% for a single IDNN on the six-speaker /b,d,g/ task. This series of modular designs leads to a highly modular multi-network ...

متن کامل

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

متن کامل

Speaker Independent Vowel Recognition using Backpropagation Neural Network on Master-Slave Architecture

Objective of the work is speaker independent recognition of vowels of British English. Back propagation is one of the simplest and most widely used methods for supervised training of multi layer neural networks. In this paper we use parallel implementation of Backpropagation (BP) on Master – Slave architecture to recognize speaker independent eleven steady state vowels of British English. We pe...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1993